SURE INDEPENDENCE SCREENING IN GENERALIZED LINEAR MODELS WITH NP-DIMENSIONALITY∗
Authors
Abstract
Princeton University and Colorado State University
Ultrahigh dimensional variable selection plays an increasingly important role in contemporary scientific discoveries and statistical research. Among others, Fan and Lv (2008) proposed an independence screening framework based on ranking the marginal correlations. They showed that the correlation ranking procedure possesses a sure independence screening property within the context of the linear model with Gaussian covariates and responses. In this paper, we propose a more general version of independence learning that ranks the maximum marginal likelihood estimates or the maximum marginal likelihood itself in generalized linear models. We show that the proposed methods, with Fan and Lv (2008) as a very special case, also possess the sure screening property with vanishing false selection rate. The conditions under which independence learning possesses the sure screening property are surprisingly simple. This justifies the applicability of such a simple method across a wide spectrum. We quantify explicitly the extent to which the dimensionality can be reduced by independence screening, which depends on the interaction between the covariance matrix of the covariates and the true parameters. Simulation studies are used to illustrate the utility of the proposed approaches. In addition, we ...
∗Supported in part by NSF grants DMS-0714554 and DMS-0704337. The bulk of the work was conducted when Rui Song was a postdoctoral research fellow at Princeton University. The authors would like to thank the associate editor and two referees for their constructive comments, which improved the presentation and the results of the paper.
AMS 2000 subject classifications: Primary 68Q32, 62J12; secondary 62E99, 60F10
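The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a logistic model as the generalized linear model, standardized covariates, a hand-written Newton-Raphson fit, and a user-chosen screening size `d`. The function names `marginal_logistic_mle` and `screen_by_marginal_mle`, the simulated data, and the value `d=20` are all hypothetical choices made for illustration.

```python
import numpy as np


def marginal_logistic_mle(x, y, n_iter=25, tol=1e-8):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson and return (b0, b1)."""
    design = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(design @ beta)))  # fitted probabilities
        weights = mu * (1.0 - mu)                    # logistic variance function
        grad = design.T @ (y - mu)                   # score vector
        # Observed information; the tiny ridge is an assumption added for stability.
        hess = design.T @ (design * weights[:, None]) + 1e-10 * np.eye(2)
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def screen_by_marginal_mle(X, y, d):
    """Return the indices of the d covariates with the largest |marginal slope|."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)        # put covariates on one scale
    slopes = np.array(
        [marginal_logistic_mle(Xs[:, j], y)[1] for j in range(Xs.shape[1])]
    )
    return np.argsort(-np.abs(slopes))[:d]


# Simulated example (hypothetical sizes): 3 active covariates out of 200.
rng = np.random.default_rng(0)
n, p = 400, 200
X = rng.standard_normal((n, p))
true_idx = [0, 1, 2]
eta = 2.0 * X[:, true_idx].sum(axis=1)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

selected = screen_by_marginal_mle(X, y, d=20)
```

Each covariate is fitted on its own, so the cost grows linearly in the number of covariates. Ranking the maximized marginal likelihood instead of the slope magnitude would only change the statistic computed inside the loop.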
Similar resources
Sure Independence Screening with NP-dimensionality
Ultrahigh dimensional variable selection plays an increasingly important role in contemporary scientific discoveries and statistical research. A simple and effective method is the correlation screening. For generalized linear models, we propose a more general version of the independent learning with ranking the maximum marginal likelihood estimates or the maximum marginal likelihood itself. We ...
Sure Independence Screening in Generalized Linear Models with NP-dimensionality
Ultrahigh-dimensional variable selection plays an increasingly important role in contemporary scientific discoveries and statistical research. Among others, Fan and Lv [J. R. Stat. Soc. Ser. B Stat. Methodol. 70 (2008) 849–911] propose an independent screening framework by ranking the marginal correlations. They showed that the correlation ranking procedure possesses a sure independence screeni...
Nonparametric Independence Screening in Sparse Ultra-High Dimensional Additive Models.
A variable screening procedure via correlation learning was proposed in Fan and Lv (2008) to reduce dimensionality in sparse ultra-high dimensional models. Even when the true model is linear, the marginal regression can be highly nonlinear. To address this issue, we further extend the correlation learning to marginal nonparametric learning. Our nonparametric independence screening is called NIS...
Feature Screening via Distance Correlation Learning.
This paper is concerned with screening features in ultrahigh dimensional data analysis, which has become increasingly important in diverse scientific fields. We develop a sure independence screening procedure based on the distance correlation (DC-SIS, for short). The DC-SIS can be implemented as easily as the sure independence screening procedure based on the Pearson correlation (SIS, for short...
Sure Independence Screening for Ultra-High Dimensional Feature Space
High dimensionality is a growing feature in many areas of contemporary statistics. Variable selection is fundamental to high-dimensional statistical modeling. For problems of large or huge scale pn, computational cost and estimation accuracy are always two top concerns. In a seminal paper, Candes and Tao (2007) propose a minimum l1 estimator, the Dantzig selector, and show that it mimics the id...
Journal title:
Volume, issue:
Pages: -
Publication date: 2010